Use less CPU and RAM #549
Conversation
This helps pipeline scripts to parse them
img2pdf only "staples together" various JPG files, whereas imagemagick loads every JPG into memory, does some kind of transcoding _then_ writes them to a PDF file. This change improves CPU and RAM usage when scanning to PDF files. Refs sbs20#537
(force-pushed from 9cbd4da to 42dae0b)
To save me digging through and testing, why does there have to be an additional newline added at the end? Is it the
I'm not sure about that. There are tricks to also handle the final line in a loop, such as
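The trick being alluded to here is, I believe, the common shell idiom of appending `|| [ -n "$line" ]` to a `while read` loop so the final line is still processed when the input lacks a trailing newline. A minimal sketch (file name and contents are illustrative, not from the PR):

```shell
# A list file whose last line has no trailing newline:
printf 'page1.jpg\npage2.jpg' > /tmp/pages.txt

# Plain `while read` silently drops the final, newline-less line:
while IFS= read -r f; do echo "plain: $f"; done < /tmp/pages.txt
# prints only: plain: page1.jpg

# `read` returns non-zero at EOF but still fills $f with the partial
# line, so the extra test catches it:
while IFS= read -r f || [ -n "$f" ]; do echo "fixed: $f"; done < /tmp/pages.txt
# prints: fixed: page1.jpg
#         fixed: page2.jpg
```

This is why always emitting a trailing newline makes downstream parsing simpler: plain `while read` loops then just work.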
That link was fascinating, thank you. Let me ponder it further.
First, thank you so much for taking the time to do this and raise such a well-documented PR. Second, sorry it's taken until now to reply.

I have given this a LOT of thought, and I have decided not to merge the PR. The main problem for me is the additional dependencies of img2pdf. There are users who really don't want additional dependencies and even go so far as to edit the Dockerfile to reduce the image size. This should not stop you from running your code, although you will need to adapt your

What I have done is raise #569, which will make it so your pipelines will work easily. What I am also going to do is see about merging in the
Thanks for your message.
Out of curiosity, what machine are you running scanservjs on?
It all depends on what the
No idea I'm afraid - I am actually fairly unfamiliar with Tesseract, sorry!
I would gladly link to someone else's image. I am less keen to add and maintain yet another image. I had never intended to support Docker, since it adds a lot of support overhead, but somehow did anyway; this project is already placing heavy demands on my time.
Yep - and I may yet include that, I just couldn't find the time to get it done and fully tested today. That's what I meant above with
Indeed. I do always try to do this - but I would welcome contributions.
Refs #537 .
This PR replaces imagemagick with img2pdf when scanning to PDF: img2pdf only "staples together" the various JPG files, whereas imagemagick loads every JPG into memory, does some kind of transcoding, and only _then_ writes them to a PDF file.
This makes some pipelines slightly faster to process, but more importantly it greatly reduces RAM usage for most of them when multiple pages are scanned. On older Raspberry Pis, this fixes the "pipeline has stalled because RAM is full" issue.
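For reference, the difference boils down to the two invocations below (file names are illustrative; `convert` is imagemagick's classic CLI, and `-o` is img2pdf's output flag):

```shell
# imagemagick: decodes every JPEG into memory, re-encodes it, then writes the PDF
convert page1.jpg page2.jpg scan.pdf

# img2pdf: embeds the existing JPEG streams directly into the PDF container,
# with no decode/re-encode step, so CPU and peak RAM stay low
img2pdf page1.jpg page2.jpg -o scan.pdf
```

A side benefit of skipping the transcode is that the embedded pages are bit-identical to the scanned JPEGs, so the step is also lossless.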
Disclaimers:
This only refs #537 and does not fix it, because one improvement is still outstanding: transcoding each page while the next page is being scanned, instead of transcoding all of them at the end. That could be done one day.
Thanks